export const description =
  'Comprehensive guide to scaling Robyn applications from development to production, including multi-process configuration, load balancing, and deployment strategies.'

# Scaling and Production Deployment

Robyn is designed to scale efficiently from development to production. This guide covers scaling strategies, production deployment patterns, and performance optimization techniques.

## Understanding Robyn's Scaling Model

### Multi-Process Architecture

Robyn uses a **shared-nothing multi-process model** optimized for modern multi-core systems. Each process:
- Runs independently with its own Python interpreter and memory space
- Has its own Global Interpreter Lock (GIL), eliminating GIL contention
- Can handle multiple concurrent requests via worker threads
- Scales linearly across CPU cores for true parallelism
- Provides fault isolation - one process crash doesn't affect others

### Worker Threads

Within each process, multiple worker threads provide concurrency:
- Share the same Python interpreter and memory space
- Are subject to the GIL for CPU-bound tasks (use more processes for CPU work)
- Excel at I/O-bound operations where threads can release the GIL
- Handle database calls, HTTP requests, file operations efficiently
- Provide request-level concurrency within the same process

## Scaling Configuration

### Basic Scaling

<Row>
  <Col>
    **Single Process Development**: Start simple during development with auto-reload enabled.
  </Col>
  <Col sticky>
    <CodeGroup title="Development Scaling">
    ```bash
    # Development mode (single process, single worker)
    python app.py --dev
    
    # Development with custom port
    python app.py --dev --port 3000
    ```
    </CodeGroup>
  </Col>
</Row>

<Row>
  <Col>
    **Multi-Core Production**: Scale across all available CPU cores for maximum performance.
  </Col>
  <Col sticky>
    <CodeGroup title="Production Scaling">
    ```bash
    # Automatic optimization (recommended)
    python app.py --fast
    
    # Manual configuration for 8-core system
    python app.py --processes 8 --workers 2
    
    # I/O heavy workloads
    python app.py --processes 4 --workers 4
    
    # CPU heavy workloads  
    python app.py --processes 8 --workers 1
    ```
    </CodeGroup>
  </Col>
</Row>

### Advanced Configuration

<Row>
  <Col>
    **Hardware-Specific Tuning**: Optimize based on your specific hardware and application characteristics.
  </Col>
  <Col sticky>
    <CodeGroup title="Advanced Tuning">
    ```bash
    # High-traffic web API (4-core system)
    python app.py --processes 4 --workers 3 --log-level INFO
    
    # Data processing service (8-core system)
    python app.py --processes 8 --workers 1 --log-level WARNING
    
    # Mixed workload (balanced approach)
    python app.py --processes 6 --workers 2 --log-level INFO
    
    # Maximum concurrency (16-core system)
    python app.py --processes 8 --workers 4
    ```
    </CodeGroup>
  </Col>
</Row>

### Configuration Guidelines

<Row>
  <Col>
    **Choosing the Right Configuration**: Use these guidelines to optimize for your specific use case.
  </Col>
  <Col sticky>
    <CodeGroup title="Configuration Guidelines">
    ```python
    # For CPU-intensive applications
    # Rule: processes = CPU cores, workers = 1
    # Example: 8-core machine
    python app.py --processes 8 --workers 1
    
    # For I/O-intensive applications  
    # Rule: processes = CPU cores / 2, workers = 2-4
    # Example: 8-core machine
    python app.py --processes 4 --workers 3
    
    # For balanced applications
    # Rule: processes = CPU cores / 2, workers = 2
    # Example: 8-core machine
    python app.py --processes 4 --workers 2
    
    # Memory considerations
    # Each process uses ~50-100MB base memory
    # Monitor with: ps aux | grep python
    ```
    </CodeGroup>
  </Col>
</Row>

## Production Deployment Strategies

### Container Deployment

<Row>
  <Col>
    **Docker Containerization**: Deploy Robyn applications using Docker for consistent environments and easy scaling.
  </Col>
  <Col sticky>
    <CodeGroup title="Docker Deployment">
    ```dockerfile
    # Dockerfile
    FROM python:3.11-slim
    
    # Install system dependencies
    RUN apt-get update && apt-get install -y \
        build-essential \
        && rm -rf /var/lib/apt/lists/*
    
    # Set working directory
    WORKDIR /app
    
    # Copy requirements and install dependencies
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    
    # Copy application code
    COPY . .
    
    # Expose port
    EXPOSE 8080
    
    # Health check
    HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
        CMD curl -f http://localhost:8080/health || exit 1
    
    # Run application with optimized settings
    CMD ["python", "app.py", "--fast", "--host", "0.0.0.0", "--port", "8080"]
    ```
    
    ```yaml
    # docker-compose.yml
    version: '3.8'
    services:
      robyn-app:
        build: .
        ports:
          - "8080:8080"
        environment:
          - ROBYN_HOST=0.0.0.0
          - ROBYN_PORT=8080
          - ROBYN_LOG_LEVEL=INFO
        restart: unless-stopped
        deploy:
          resources:
            limits:
              cpus: '2.0'
              memory: 1G
            reservations:
              cpus: '1.0'
              memory: 512M
    ```
    </CodeGroup>
  </Col>
</Row>

### Load Balancer Configuration

<Row>
  <Col>
    **Nginx Load Balancing**: Use Nginx as a reverse proxy and load balancer for multiple Robyn instances.
  </Col>
  <Col sticky>
    <CodeGroup title="Nginx Configuration">
    ```nginx
    # /etc/nginx/sites-available/robyn-app
    upstream robyn_backend {
        least_conn;
        server 127.0.0.1:8080 max_fails=3 fail_timeout=30s;
        server 127.0.0.1:8081 max_fails=3 fail_timeout=30s;
        server 127.0.0.1:8082 max_fails=3 fail_timeout=30s;
        server 127.0.0.1:8083 max_fails=3 fail_timeout=30s;
    }
    
    server {
        listen 80;
        server_name yourdomain.com;
        
        # Security headers
        add_header X-Frame-Options DENY;
        add_header X-Content-Type-Options nosniff;
        add_header X-XSS-Protection "1; mode=block";
        
        # Gzip compression
        gzip on;
        gzip_types text/plain application/json application/javascript text/css;
        
        # Static files (if any)
        location /static/ {
            alias /var/www/static/;
            expires 1y;
            add_header Cache-Control "public, immutable";
        }
        
        # API routes
        location / {
            proxy_pass http://robyn_backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            
            # Timeouts
            proxy_connect_timeout 5s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
            
            # Health checks
            proxy_next_upstream error timeout http_500 http_502 http_503;
        }
        
        # Health check endpoint
        location /health {
            access_log off;
            proxy_pass http://robyn_backend;
        }
    }
    ```
    </CodeGroup>
  </Col>
</Row>

### Kubernetes Deployment

<Row>
  <Col>
    **Kubernetes Orchestration**: Deploy and scale Robyn applications in Kubernetes clusters.
  </Col>
  <Col sticky>
    <CodeGroup title="Kubernetes Manifests">
    ```yaml
    # deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: robyn-app
      labels:
        app: robyn-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: robyn-app
      template:
        metadata:
          labels:
            app: robyn-app
        spec:
          containers:
          - name: robyn-app
            image: your-registry/robyn-app:latest
            ports:
            - containerPort: 8080
            env:
            - name: ROBYN_HOST
              value: "0.0.0.0"
            - name: ROBYN_PORT
              value: "8080"
            - name: ROBYN_LOG_LEVEL
              value: "INFO"
            resources:
              requests:
                memory: "256Mi"
                cpu: "250m"
              limits:
                memory: "512Mi"
                cpu: "500m"
            livenessProbe:
              httpGet:
                path: /health
                port: 8080
              initialDelaySeconds: 30
              periodSeconds: 10
            readinessProbe:
              httpGet:
                path: /health
                port: 8080
              initialDelaySeconds: 5
              periodSeconds: 5
    
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: robyn-service
    spec:
      selector:
        app: robyn-app
      ports:
      - protocol: TCP
        port: 80
        targetPort: 8080
      type: LoadBalancer
    
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: robyn-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: robyn-app
      minReplicas: 3
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 70
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 80
    ```
    </CodeGroup>
  </Col>
</Row>

## Monitoring and Observability

### Application Metrics

<Row>
  <Col>
    **Built-in Monitoring**: Add comprehensive monitoring and metrics collection to your Robyn application for production observability.
  </Col>
  <Col sticky>
    <CodeGroup title="Advanced Monitoring Setup">
    ```python
    from robyn import Robyn
    import time
    import psutil
    import logging
    import threading
    from collections import defaultdict, deque
    
    app = Robyn(__file__)
    
    # Configure structured logging
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[
            logging.StreamHandler(),
            logging.FileHandler('app.log')
        ]
    )
    logger = logging.getLogger(__name__)
    
    # Advanced metrics collection
    class MetricsCollector:
        def __init__(self):
            self.request_count = 0
            self.total_response_time = 0
            self.status_codes = defaultdict(int)
            self.endpoint_stats = defaultdict(lambda: {"count": 0, "total_time": 0})
            self.response_times = deque(maxlen=1000)  # Last 1000 requests
            self.lock = threading.Lock()
        
        def record_request(self, method, path, status_code, duration):
            with self.lock:
                self.request_count += 1
                self.total_response_time += duration
                self.status_codes[status_code] += 1
                endpoint_key = f"{method} {path}"
                self.endpoint_stats[endpoint_key]["count"] += 1
                self.endpoint_stats[endpoint_key]["total_time"] += duration
                self.response_times.append(duration)
        
        def get_metrics(self):
            with self.lock:
                avg_response_time = self.total_response_time / max(self.request_count, 1)
                p95_response_time = sorted(self.response_times)[int(len(self.response_times) * 0.95)] if self.response_times else 0
                
                return {
                    "requests_total": self.request_count,
                    "avg_response_time": avg_response_time,
                    "p95_response_time": p95_response_time,
                    "status_codes": dict(self.status_codes),
                    "top_endpoints": dict(sorted(
                        self.endpoint_stats.items(),
                        key=lambda x: x[1]["count"],
                        reverse=True
                    )[:10])
                }
    
    metrics = MetricsCollector()
    start_time = time.time()
    
    @app.before_request
    def monitor_request_start(request):
        request.start_time = time.time()
        return request
    
    @app.after_request
    def monitor_request_end(request, response):
        duration = time.time() - request.start_time
        status_code = getattr(response, 'status_code', 200)
        
        # Record metrics
        metrics.record_request(request.method, request.url.path, status_code, duration)
        
        # Log slow requests
        if duration > 1.0:
            logger.warning(
                f"Slow request: {request.method} {request.url.path} - "
                f"{duration:.3f}s (status: {status_code})"
            )
        
        # Add response headers
        response.headers["X-Response-Time"] = f"{duration:.3f}s"
        response.headers["X-Request-ID"] = str(time.time_ns())
        return response
    
    # Health endpoint with detailed status
    @app.get("/health", const=True)
    def health_check():
        return {
            "status": "healthy",
            "version": "1.0.0",
            "uptime_seconds": time.time() - start_time
        }
    
    # Comprehensive metrics endpoint
    @app.get("/metrics")
    def get_metrics():
        cpu_percent = psutil.cpu_percent(interval=1)
        memory = psutil.virtual_memory()
        disk = psutil.disk_usage('/')
        
        app_metrics = metrics.get_metrics()
        
        return {
            "system": {
                "cpu_usage_percent": cpu_percent,
                "memory_usage_percent": memory.percent,
                "memory_available_mb": memory.available / 1024 / 1024,
                "disk_usage_percent": disk.percent,
                "load_average": psutil.getloadavg()
            },
            "application": app_metrics,
            "timestamp": time.time()
        }
    
    # Readiness endpoint for k8s
    @app.get("/ready")
    def readiness_check():
        # Add your readiness checks here (DB connection, etc.)
        return {"ready": True}
    ```
    </CodeGroup>
  </Col>
</Row>

## Performance Optimization

### Scaling Best Practices

1. **Start Simple**: Begin with `--fast` mode and measure performance
2. **Profile Your Application**: Identify CPU vs I/O bound operations
3. **Test Different Configurations**: Benchmark various process/worker combinations
4. **Monitor Resource Usage**: Watch CPU, memory, and I/O utilization
5. **Scale Horizontally**: Use multiple instances behind a load balancer

### Performance Testing

<Row>
  <Col>
    **Load Testing**: Use tools like wrk, Apache Bench, or Locust to test different configurations and find optimal settings.
  </Col>
  <Col sticky>
    <CodeGroup title="Performance Testing">
    ```bash
    # Install wrk for load testing
    # macOS: brew install wrk
    # Ubuntu: sudo apt install wrk
    
    # Test baseline performance
    wrk -t4 -c100 -d30s http://localhost:8080/health
    
    # Test with different configurations
    # Configuration 1: Single process
    python app.py --processes 1 --workers 1 &
    wrk -t4 -c100 -d30s http://localhost:8080/api/endpoint
    
    # Configuration 2: Multi-process
    python app.py --processes 4 --workers 2 &
    wrk -t4 -c100 -d30s http://localhost:8080/api/endpoint
    
    # Configuration 3: Fast mode
    python app.py --fast &
    wrk -t4 -c100 -d30s http://localhost:8080/api/endpoint
    
    # Compare results and choose the best configuration
    ```
    
    ```python
    # Python script for automated testing
    import subprocess
    import time
    import requests
    
    def test_configuration(processes, workers, duration=30):
        # Start server
        cmd = f"python app.py --processes {processes} --workers {workers}"
        server = subprocess.Popen(cmd.split())
        time.sleep(2)  # Wait for startup
        
        try:
            # Run load test
            wrk_cmd = f"wrk -t4 -c100 -d{duration}s http://localhost:8080/health"
            result = subprocess.run(wrk_cmd.split(), capture_output=True, text=True)
            return result.stdout
        finally:
            server.terminate()
            time.sleep(1)
    
    # Test different configurations
    configs = [(1, 1), (2, 2), (4, 1), (4, 2), (8, 1)]
    for processes, workers in configs:
        print(f"\nTesting: {processes} processes, {workers} workers")
        result = test_configuration(processes, workers)
        print(result)
    ```
    </CodeGroup>
  </Col>
</Row>

### Environment Variables

<Row>
  <Col>
    **Configuration via Environment**: Use environment variables for deployment flexibility.
  </Col>
  <Col sticky>
    <CodeGroup title="Environment Configuration">
    ```bash
    # Production environment variables
    export ROBYN_HOST=0.0.0.0
    export ROBYN_PORT=8080
    export ROBYN_LOG_LEVEL=INFO
    export ROBYN_PROCESSES=4
    export ROBYN_WORKERS=2
    
    # Database configuration
    export DATABASE_URL=postgresql://user:pass@localhost/db
    export REDIS_URL=redis://localhost:6379/0
    
    # Application settings
    export SECRET_KEY=your-secret-key
    export DEBUG=false
    export ENVIRONMENT=production
    
    # Start application with environment config
    python app.py
    ```
    
    ```python
    # app.py - Using environment variables
    import os
    from robyn import Robyn
    
    app = Robyn(__file__)
    
    # Configuration from environment
    HOST = os.getenv("ROBYN_HOST", "127.0.0.1")
    PORT = int(os.getenv("ROBYN_PORT", "8080"))
    PROCESSES = int(os.getenv("ROBYN_PROCESSES", "1"))
    WORKERS = int(os.getenv("ROBYN_WORKERS", "1"))
    LOG_LEVEL = os.getenv("ROBYN_LOG_LEVEL", "INFO")
    
    if __name__ == "__main__":
        app.start(
            host=HOST,
            port=PORT,
            processes=PROCESSES,
            workers=WORKERS,
            log_level=LOG_LEVEL
        )
    ```
    </CodeGroup>
  </Col>
</Row>

---

## What's next?

Now, Batman wanted to extend Robyn. Robyn told him about the advanced features.

- [Advanced Features](/documentation/en/api_reference/advanced_features)

